Exploration server for computer science research in Lorraine

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Missing data mask estimation with frequency and temporal dependencies

Internal identifier: 003B74 (Main/Exploration); previous: 003B73; next: 003B75


Authors: Sébastien Demange [France]; Christophe Cerisara [France]; Jean-Paul Haton [France]

Source: Computer speech & language (Print), 2009; ISSN 0885-2308

RBID : Francis:09-0009467


Abstract

Automatic speech recognition (ASR) has reached a very high level of performance in controlled situations. However, the performance degrades drastically when environmental noise occurs during recognition. Nowadays, the major challenge is to reach a good robustness to adverse conditions. Missing data recognition has been developed to deal with this challenge. Unlike other denoising methods, missing data recognition does not match the whole data with the acoustic models, but instead considers part of the signal as missing, i.e. corrupted by noise. The main challenge of this approach is to identify accurately missing parts (also called masks). The work reported here focuses on this issue. We start from developing Bayesian models of the masks, where every spectral feature is classified as reliable or masked, and is assumed independent of the rest of the signal. This classification strategy results in sparse and isolated masked features, like the squares of a chess-board, while oracle reliable and unreliable features tend to be clustered into consistent time-frequency blocks. We then propose to take into account frequency and temporal dependencies in order to improve the masks' estimation accuracy. Integrating such dependencies leads to a new architecture of a missing data mask estimator. The proposed classifier has been evaluated on the noisy Aurora2 (digits recognition) and Aurora4 (continuous speech) databases. Experimental results show a significant improvement of recognition accuracy when these dependencies are considered.



The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Missing data mask estimation with frequency and temporal dependencies</title>
<author>
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">09-0009467</idno>
<date when="2009">2009</date>
<idno type="stanalyst">FRANCIS 09-0009467 INIST</idno>
<idno type="RBID">Francis:09-0009467</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000296</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000735</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000236</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">000236</idno>
<idno type="wicri:doubleKey">0885-2308:2009:Demange S:missing:data:mask</idno>
<idno type="wicri:Area/Main/Merge">003C70</idno>
<idno type="wicri:Area/Main/Curation">003B74</idno>
<idno type="wicri:Area/Main/Exploration">003B74</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Missing data mask estimation with frequency and temporal dependencies</title>
<author>
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation wicri:level="3">
<inist:fA14 i1="01">
<s1>LORIA, UMR 7503</s1>
<s2>Nancy</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<region type="region">Grand Est</region>
<region type="old region">Lorraine (région)</region>
<settlement type="city">Nancy</settlement>
</placeName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="laboratoire" n="5">Laboratoire lorrain de recherche en informatique et ses applications</orgName>
<orgName type="university">Université de Lorraine</orgName>
<orgName type="institution">Centre national de la recherche scientifique</orgName>
<orgName type="institution">Institut national de recherche en informatique et en automatique</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
<imprint>
<date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Computer speech &amp; language : (Print)</title>
<title level="j" type="abbreviated">Comput. speech lang. : (Print)</title>
<idno type="ISSN">0885-2308</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Computational linguistics</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Linguistique informatique</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Automatic speech recognition (ASR) has reached a very high level of performance in controlled situations. However, the performance degrades drastically when environmental noise occurs during recognition. Nowadays, the major challenge is to reach a good robustness to adverse conditions. Missing data recognition has been developed to deal with this challenge. Unlike other denoising methods, missing data recognition does not match the whole data with the acoustic models, but instead considers part of the signal as missing, i.e. corrupted by noise. The main challenge of this approach is to identify accurately missing parts (also called masks). The work reported here focuses on this issue. We start from developing Bayesian models of the masks, where every spectral feature is classified as reliable or masked, and is assumed independent of the rest of the signal. This classification strategy results in sparse and isolated masked features, like the squares of a chess-board, while oracle reliable and unreliable features tend to be clustered into consistent time-frequency blocks. We then propose to take into account frequency and temporal dependencies in order to improve the masks' estimation accuracy. Integrating such dependencies leads to a new architecture of a missing data mask estimator. The proposed classifier has been evaluated on the noisy Aurora2 (digits recognition) and Aurora4 (continuous speech) databases. Experimental results show a significant improvement of recognition accuracy when these dependencies are considered.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Nancy</li>
</settlement>
<orgName>
<li>Centre national de la recherche scientifique</li>
<li>Institut national de recherche en informatique et en automatique</li>
<li>Laboratoire lorrain de recherche en informatique et ses applications</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Grand Est">
<name sortKey="Demange, Sebastien" sort="Demange, Sebastien" uniqKey="Demange S" first="Sébastien" last="Demange">Sébastien Demange</name>
</region>
<name sortKey="Cerisara, Christophe" sort="Cerisara, Christophe" uniqKey="Cerisara C" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<name sortKey="Haton, Jean Paul" sort="Haton, Jean Paul" uniqKey="Haton J" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003B74 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 003B74 | SxmlIndent | more
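
Without a Dilib installation, the XML record shown on this page can probably also be read with a generic XML parser. The sketch below uses Python's standard xml.etree.ElementTree on a trimmed excerpt of the record (affiliation details omitted). The `wicri:` and `inist:` prefixes are not declared in the record, so the sketch binds them to placeholder URIs before parsing. Those URIs, the `PLACEHOLDER_NS` constant, and the helper names `parse_record` and `summarize` are assumptions made for illustration and are not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record shown on this page (affiliation details omitted).
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Missing data mask estimation with frequency and temporal dependencies</title>
<author>
<name sortKey="Demange, Sebastien" first="Sébastien" last="Demange">Sébastien Demange</name>
<affiliation wicri:level="3"><country>France</country></affiliation>
</author>
<author>
<name sortKey="Cerisara, Christophe" first="Christophe" last="Cerisara">Christophe Cerisara</name>
<affiliation wicri:level="3"><country>France</country></affiliation>
</author>
<author>
<name sortKey="Haton, Jean Paul" first="Jean-Paul" last="Haton">Jean-Paul Haton</name>
<affiliation wicri:level="3"><country>France</country></affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="RBID">Francis:09-0009467</idno>
<date when="2009">2009</date>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

# Placeholder namespace URIs (assumption): the record does not declare the
# wicri: and inist: prefixes, so a namespace-aware parser would reject it as is.
PLACEHOLDER_NS = 'xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist"'


def parse_record(xml_text):
    """Bind the undeclared prefixes on the root element, then parse."""
    patched = xml_text.replace("<record>", "<record " + PLACEHOLDER_NS + ">", 1)
    return ET.fromstring(patched)


def summarize(root):
    """Collect the title, author names, RBID and year from the TEI header."""
    title = root.findtext(".//titleStmt/title")
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    rbid = root.findtext(".//publicationStmt/idno[@type='RBID']")
    year = root.find(".//publicationStmt/date").get("when")
    return {"title": title, "authors": authors, "rbid": rbid, "year": year}


summary = summarize(parse_record(RECORD))
print(summary)
```

The same two helpers should work on the output of the HfdSelect commands above once it is saved to a file, provided that output is otherwise well-formed XML.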

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Francis:09-0009467
   |texte=   Missing data mask estimation with frequency and temporal dependencies
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022